Calculating Statistical Similarity between Sentences
Abstract
Sentence similarity plays an important role in text-related research and applications. It is closely related to word similarity and document similarity. Statistical similarity measures, which are based on the symbolic characteristics and structural information of sentences, can measure the similarity between sentences without any prior knowledge, relying only on statistical information from the sentences themselves. This paper presents several approaches to calculating statistical similarity between sentences and evaluates them on a test corpus of 40 sentences. These measures can be used in short-text applications such as corpus construction and document recommendation based on titles and abstracts. The evaluation results show how these measures differ.
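The abstract does not name the specific measures the paper evaluates, so the following is only a minimal sketch of what knowledge-free statistical measures can look like. It assumes two common choices that are not confirmed by the source: Jaccard overlap of word sets (symbolic characteristics) and a longest-common-subsequence ratio (a simple proxy for structural, word-order information). The function names `tokenize`, `jaccard_similarity`, `lcs_length`, and `order_similarity` are hypothetical.

```python
import re


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def jaccard_similarity(a, b):
    """Word-set overlap |A & B| / |A | B|; ignores word order entirely."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 1.0  # assumption: two empty sentences count as identical
    return len(set_a & set_b) / len(set_a | set_b)


def lcs_length(x, y):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def order_similarity(a, b):
    """Order-aware score: LCS length divided by the longer sentence length."""
    tok_a, tok_b = tokenize(a), tokenize(b)
    longest = max(len(tok_a), len(tok_b))
    if longest == 0:
        return 1.0  # assumption: two empty sentences count as identical
    return lcs_length(tok_a, tok_b) / longest


if __name__ == "__main__":
    s1 = "The cat sat on the mat"
    s2 = "The mat sat on the cat"
    print(jaccard_similarity(s1, s2))  # same vocabulary in both sentences
    print(order_similarity(s1, s2))    # lower, because the word order differs
```

For the two example sentences the word sets are identical, so the Jaccard score is 1.0, while the order-aware score drops to 4/6 because only four tokens can be aligned in sequence. This kind of gap between measures is an illustration only, not a result from the paper.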
Similar resources
A Sentence Semantic Similarity Calculating Method Based on Segmented Semantic Comparison
To calculate sentence semantic similarity more accurately, a method based on segmented semantic comparison is proposed. Sentences are divided into a trunk and other segments by grammar rules, and each segment may be further divided into several shorter segments. When calculating the sentence semantic similarity between two sentences, ...
Calculating the similarity between words and sentences using a lexical database and corpus statistics
Calculating the semantic similarity between sentences is a long-standing problem in natural language processing. Semantic analysis plays a crucial role in text analytics research. Semantic similarity varies with the domain of operation. In this paper, we present a methodology that addresses this issue by incorporating semantic similarity ...
Measuring the sentence level similarity
This article describes a method used to calculate the similarity between short English texts, specifically of sentence length. The described algorithm calculates semantic and word order similarities of two sentences. In order to do so, it uses a structured lexical knowledge base and statistical information from a corpus. The described method works well in determining sentence similarity for mos...
DERI&UPM: Pushing Corpus Based Relatedness to Similarity: Shared Task System Description
In this paper, we describe our system submitted for the semantic textual similarity (STS) task at SemEval 2012. We implemented two approaches to calculate the degree of similarity between two sentences. The first approach combines a corpus-based semantic relatedness measure over the whole sentence with the knowledge-based semantic similarity scores obtained for the words falling under the same syntac...
Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology
Determining semantic similarity between academic documents is crucial to many tasks, such as plagiarism detection, automatic technical surveys, and semantic search. Current studies mostly focus on semantic similarity between concepts, sentences, and short text fragments. However, document-level semantic matching is still based on surface-level statistical information, neglecting article structur...
Publication date: 2011